Pushing Stochastic Gradient towards Second-Order Methods -- Backpropagation Learning with Transformations in Nonlinearities

Abstract

Recently, we proposed to transform the outputs of each hidden neuron in a multi-layer perceptron network to have zero output and zero slope on average, and use separate shortcut connections to model the linear dependencies instead. We continue the work by firstly introducing a third transformation to normalize the scale of the outputs of each hidden neuron, and secondly by analyzing the connections to second-order optimization methods. We show that the transformations make a simple stochastic gradient behave closer to second-order optimization methods and thus speed up learning. This is shown both in theory and with experiments. The experiments on the third transformation show that while it further increases the speed of learning, it can also hurt performance by converging to a worse local optimum, where both the inputs and outputs of many hidden neurons are close to zero.
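
As a rough illustration of the transformations described in the abstract, the sketch below (our own assumption, not code from the paper) shows how per-unit parameters alpha, beta, and gamma could be chosen from a batch of pre-activations of a single tanh hidden unit so that its average slope is zero, its average output is zero, and its output scale is normalized. In the paper, the linear component removed this way is carried by separate shortcut connections rather than discarded.

```python
import numpy as np

def transformed_tanh(x, eps=1e-8):
    """Apply the three output transformations to one tanh hidden unit.

    x : pre-activations of the unit over a mini-batch, shape (batch,).
    Returns the transformed activations and the (alpha, beta, gamma) parameters.
    Names and the exact estimation rule are illustrative assumptions.
    """
    t = np.tanh(x)
    # 1) Zero average slope: d/dx [tanh(x) + alpha*x] = 1 - tanh(x)**2 + alpha,
    #    so choose alpha to cancel the mean slope over the batch.
    alpha = -np.mean(1.0 - t ** 2)
    # 2) Zero average output: shift by beta so the mean activation vanishes.
    beta = -np.mean(t + alpha * x)
    centered = t + alpha * x + beta
    # 3) Third transformation: rescale so the output has unit standard deviation.
    gamma = 1.0 / (np.std(centered) + eps)
    return gamma * centered, (alpha, beta, gamma)

# Example: transform a batch of pre-activations for one hidden unit.
h, params = transformed_tanh(np.random.randn(128))
```

In this sketch the parameters are recomputed from a single batch for clarity; a practical implementation would track them as running estimates alongside the weights during training.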
